Identifying sequence-structure pairs undetected by sequence alignments.

Authors

  • S Miyazawa
  • R L Jernigan
Abstract

We examine how effectively simple, previously developed potential functions can identify compatibilities between sequences and structures of proteins in database searches. The potential function consists of pairwise contact energies, repulsive packing potentials of residues for overly dense arrangements, and short-range potentials for secondary structures, all of which were estimated from statistical preferences observed in known protein structures. Each potential energy term was modified to represent compatibilities between sequences and structures for globular proteins. Pairwise contact interactions in a sequence-structure alignment are evaluated in a mean field approximation on the basis of the probabilities that site pairs are aligned. Gap penalties are assumed to be proportional to the number of contacts at each residue position; as a result, gaps are placed more frequently on protein surfaces than in cores. In addition to minimum energy alignments, we use probability alignments made by successively aligning site pairs in order of their pairwise alignment probabilities. The results show that the present energy function and alignment method can effectively detect both folds compatible with a given sequence and, inversely, sequences compatible with a given fold, and that they yield mostly similar alignments for these two types of sequence and structure pairs. Probability alignments consisting of only the most reliable site pairs can yield extremely small root mean square deviations; including less reliable pairs increases the deviations. Secondary structure potentials are also observed to be a useful complement, yielding improved alignments with this method. Remarkably, this method detects some individual sequence-structure pairs that have only 5-20% sequence identity.
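The two alignment ideas in the abstract, gap penalties proportional to contact counts and probability alignments built by adding site pairs in order of alignment probability, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the `scale` factor, the `min_prob` cutoff, and the consistency rule (each site used once, sequence order preserved) are assumptions introduced here for illustration, and the pair probabilities are taken as given rather than computed from the energy function.

```python
from typing import Dict, List, Tuple


def gap_penalty(n_contacts: int, scale: float = 1.0) -> float:
    """Gap penalty proportional to the number of contacts at a position.

    `scale` is a hypothetical proportionality constant, not a value
    taken from the paper.
    """
    return scale * n_contacts


def probability_alignment(
    pair_probs: Dict[Tuple[int, int], float],
    min_prob: float = 0.0,
) -> List[Tuple[int, int]]:
    """Greedily build an alignment from site-pair probabilities.

    Pairs (i, j) link sequence site i to structure site j. They are
    visited in descending probability and accepted only if they are
    consistent with every pair accepted so far: no site is reused and
    the order along both chains is preserved (an assumed rule).
    """
    accepted: List[Tuple[int, int]] = []
    ranked = sorted(pair_probs.items(), key=lambda kv: -kv[1])
    for (i, j), p in ranked:
        if p < min_prob:
            break  # all remaining pairs are even less reliable
        if all((i - a) * (j - b) > 0 for a, b in accepted):
            accepted.append((i, j))
    return sorted(accepted)


if __name__ == "__main__":
    # Toy probabilities, invented purely for this example.
    probs = {(0, 0): 0.9, (1, 2): 0.8, (2, 1): 0.7, (3, 3): 0.6, (1, 1): 0.2}
    print(probability_alignment(probs))
    print(probability_alignment(probs, min_prob=0.85))
    print(gap_penalty(5, scale=0.5))
```

Raising `min_prob` keeps only the most reliable site pairs, which mirrors the abstract's observation that such restricted alignments give smaller root mean square deviations. Because the penalty grows with the contact count, gaps are cheaper at sparsely contacted surface positions than in the densely packed core.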


Related resources

Recurrent structural RNA motifs, Isostericity Matrices and sequence alignments

The occurrences of two recurrent motifs in ribosomal RNA sequences, the Kink-turn and the C-loop, are examined in crystal structures and systematically compared with sequence alignments of rRNAs from the three kingdoms of life in order to identify the range of the structural and sequence variations. Isostericity Matrices are used to analyze structurally the sequence variations of the characteri...


Searching databases of conserved sequence regions by aligning protein multiple-alignments.

A general searching method for comparing multiple sequence alignments was developed to detect sequence relationships between conserved protein regions. Multiple alignments are treated as sequences of amino acid distributions and aligned by comparing pairs of such distributions. Four different comparison measures were tested and the Pearson correlation coefficient chosen. The method is sensitive...


Protein sequence-structure alignment based on site-alignment probabilities.

A protein sequence-structure alignment method for database searches is examined to determine how effectively it, together with a simple, previously developed scoring function, can identify compatibilities between sequences and structures of proteins. The scoring function consists of pairwise contact energies, repulsive packing potentials of residues for overly dense arrangement and short-range pote...


DBAli: a database of protein structure alignments

Summary: The DBAli database includes approximately 35000 alignments of pairs of protein structures from SCOP (Lo Conte et al., Nucleic Acids Res., 28, 257-259, 2000) and CE (Shindyalov and Bourne, Protein Eng., 11, 739-747, 1998). DBAli is linked to several resources, including Compare3D (Shindyalov and Bourne, http://www.sdsc.edu/pb/software.htm, 1999) and ModView (Ilyin and Sali, http://guitar...


fRMSDAlign: Protein Sequence Alignment Using Predicted Local Structure Information for Pairs with Low Sequence Identity

As the sequence identity between a pair of proteins decreases, alignment strategies that are based on sequence and/or sequence profiles become progressively less effective in identifying the correct structural correspondence between residue pairs. This significantly reduces the ability of comparative modeling-based approaches to build accurate structural models. Incorporating into the alignment ...



Journal title:
  • Protein engineering

Volume 13, Issue 7

Pages: -

Publication date: 2000